Abstract: White blood cells (WBC), or leukocytes, are 
an essential part of the human immune system, 
constantly protecting the body against viruses, bacteria 
and other foreign invaders. Determining the WBC count is 
crucial for treating various diseases, especially leukemia. 
An abnormal WBC count is characteristic of leukemia, 
which impairs the immune system. Image segmentation, an 
application of pattern recognition techniques, is 
employed in this paper to find the WBC count. WBC are 
identified by the colour difference between their nucleus 
and cytoplasm. Using the U-net CNN architecture, the cell 
borders of WBC are marked, enabling us to find the area 
of abnormality in the given sample. The model is 
trained on the images in the known training data and 
reaches an accuracy of around 87% for 
segmenting WBC from the blood smear. The final output 
is produced with the concept of swarm intelligence in 
AI. First, the WBC count in the image is found with an 
OpenCV method for optimization purposes. 
Secondly, the system transfers and modifies the model 
with the transfer learning models VGG/ResNet and counts 
cells with the deep neural model. The counting model 
can be reused for other modelling and application 
purposes. 


Keywords: white blood cells, swarm intelligence, 
transfer learning, image segmentation, HSI, U-net. 


1. Introduction 

Pattern recognition is the automated recognition of patterns in data. In this paper, we use an 
automatic image segmentation technique to separate White Blood Cells (WBC) from a blood smear that 
also contains red blood cells (RBC) and platelets [37, 38]. WBCs are essential for the defence of the 
human body. They safeguard the body against viruses, bacteria and other dangerous external 
invaders; hence WBCs are called the soldiers of our body. Determining the WBC count is crucial for 
treating various diseases, especially leukemia. Leukemia patients typically have a high WBC count 
together with a weakened immune system. Hence the WBC count has to be monitored continuously for 
leukemia patients [39-45]. It also helps to understand and monitor the effect of chemotherapy and the 
immunity level, and to identify anaemic diseases and other infections [46]. The proposed system aims at 
finding the WBC count by using the HSI colour space transformation, marking areas of infection with the 
U-net architecture and tuning the model for cell counts with transfer learning [47-53]. Current practice 
relies either on a manual microscopic process that takes 2 days or on an automated process that can only 
be applied to test data. If any abnormality is present, the process must be redone multiple times [54]. 


2. Related Works 

The system proposed in [1] describes different colour space techniques for representing coloured digital 
images. It applies colour space transformations and correlates the outcomes in parallel in 
Matlab. It finds that the HSV colour space is better suited to round pictures or structures, like a CD or a clock. 
The paper [2] suggests a colour correction method for differentiating nucleus and cytoplasm [55-66]. 
It transforms RGB to the Lab colour space, adjusts the input image to a template image, then transforms 
back to RGB and proceeds to segmentation. If an application involves few complex colours, it is enough to 
detect extreme colours. [11] proposes a value-based method in which extreme colours have a high 
value in only one channel [67-71]. Hence, this method can be applied to detect basic colours. When 
all blood cells in a smear image have a similar colour, it is not easy to detect the target. So, the 
image's hue should be increased, and its brightness should be adjusted. [15] proposes a similar HSI 
colour transformation method with a CNN for classification [72]. 


In [21], a greyscale representation is developed to differentiate nuclei from cytoplasm and background 
using a Poisson distribution with minimum error thresholding. The cytoplasm is identified using the 
Discrete Wavelet Transform (DWT) and morphological filtering. This gives more efficient segmentation 
than existing models. In [29], colour correction and auto white balance (AWB) are employed to retrieve 
spectral reflectance properties. The outputs of CCD/CMOS sensors are used to estimate the colour 
temperature of illuminants efficiently [73-81]. The colour correction matrix converts the values into 
accurate sRGB colours. The paper [2] notes that architectures like E-net and U-net are used for 
semantic segmentation of biomedical data because deep neural networks are very costly. U-net is less 
computationally efficient than E-net, but it is 1-2% more accurate [82-91]. The paper [8] argues that 
manual methods have drawbacks compared to automatic methods, so automated processes are more 
efficient than manual ones. Its framework for the separation is carried out by advanced digital image 
processing [92-101]. It obtains 92% and 78% accuracy for the separation of nucleus and cytoplasm, 
respectively [102]. 


The paper [12] presents a network and training strategy that relies on heavy use of data augmentation to 
use the available annotated samples more efficiently. It shows that the U-net architecture achieves good 
results in biomedical image segmentation [103]. The training time on an Nvidia Titan GPU is 10 
hours. [16] outlines the YOLO (you only look once) layout as a framework that performs detection in 
the areas with a high probability of containing the expected output [104-115]. This framework can be 
combined with many algorithms. One such algorithm is UOLO, which combines the U-net architecture and 
YOLO: it first localizes targets on a grid over the image and then searches for them with pixel-wise 
segmentation [17]. The main purpose of [18] is to automate the segmentation. It provides two types of 
results: a binary-class result that differentiates cells from the rest of the image, and a multiclass 
result that classifies each cell into nucleus and cytoplasm. The model's efficiency is evaluated with 
the intersection over union (IoU) score [116]. 


According to [3], conventional identification procedures use handcrafted features, which restrict the 
structures that can be handled [117-121]. When training data is scarce, deep learning models are slow 
and hard to train from scratch. Hence transfer learning is used, and it improved the results reported 
for breast cancer. The paper [5] uses two approaches from the transfer learning technique [122-127]. The 
first uses the final fully connected layer to produce features that are then passed to a separate 
classifier. In the second, called network fine-tuning, the higher-level layers are replaced and 
retrained [128]. For larger data sets, fine-tuning can give more accurate outcomes than a separate 
classifier on top of the network. In [7], RetinaNet is used to develop I3DR-Net for lung nodule detection, 
which is made more efficient by using the transfer learning models I3DR-Net T-LIDC and I3DR-Net T-ImageNet 
[129]. The detection performance and training time are improved by transferring weights learned 
from natural image datasets [130]. 


The paper [13] categorizes white blood cells and cattle sperm with image processing and 
compares different transfer learning models such as InceptionResNetV2, Xception, 
DenseNet121, DenseNet169, and MobileNetV1. InceptionResNetV2 and DenseNet121 obtain the best 
outcomes for cattle sperm and WBC segmentation. In [25], deep transfer learning models with fewer 
layers, such as AlexNet, ResNet18, SqueezeNet, GoogleNet, VGG16, and VGG19, are tested for 
Diabetic Retinopathy (DR) detection. An accuracy of 97.9% was attained using the AlexNet model, 
which also has lower computational complexity and training time [131-145]. [14] uses preprocessing 
steps such as colour distortion, bounding box distortion, and image flipping and mirroring to evaluate 
WBC differential counts [146-155]. Hierarchical topological feature extraction using the Inception and 
ResNet architectures is employed for recognition. The detection of objects in biomedical images faces 
problems with the varying structures of cells [156-167]. So, [19] involves a generative adversarial 
network (GAN) in the transformation process [168-171]. It uses deep neural networks like ResNet and 
VGG-16 for classification training, with data augmentation by the GAN, to classify white blood 
cells [172]. 


The paper [22] builds on the knowledge gained since the 1970s on the ideas of transfer learning in 
neural networks [173]. It presents mathematical models and geometric representations of transfer 
learning that result from its use in pattern recognition on image datasets. [24] uses pre-trained 
CNNs to extract features from the input image [174]. The approach can recognize various 
Indian languages while being trained on a single dataset of Bangla characters, and it gives better 
results than randomly initialized weights [175]. The paper [4] replaces manual processes with 
computerized systems to identify red blood cells [176-181]. Artificial Neural Networks classify normal 
and abnormal red blood cells in blood images. With the help of the green colour channel of the image and 
a series of post-processing steps, red blood cells can be separated from clusters. The paper [6] explains 
a method of making leukocyte filters. With these filters, considerably fewer red blood cells are lost 
when producing red blood cell products [182-189]. In order to eliminate the time-consuming manual 
process, [10] reviews various papers on automatic WBC segmentation. This indicates which technique can 
be used for which type of image or output [190]. 


In [20], a clustered image is classified into nuclei, background and clusters of cells using the SVM 
algorithm. The cytoplasmic shape of overlapping cells is recovered using the Space Coding method. 
Distance Regularized Level Set Evolution (DRLSE) is deployed to form a defined shape and obtain 
clear nuclear boundaries [191]. The paper [26] studies the effect of convolutional network depth on 
large-scale image recognition. Its main focus is 3x3 convolution filters with increasing depth. 
For training, the ConvNets take the RGB pixel values as input. 
In [27], the layers in deep neural networks are reformulated in order to ease the training process 
[192-195]. On the ImageNet dataset, residual nets up to 8 times deeper than VGG nets are evaluated, 
and an error rate of only 3.57% is achieved. The paper [28] presents COCO-Text for text detection and 
recognition. To diversify the texts, machine-printed and handwritten texts are given fine-grained 
labels, and legible ones are given proper transcriptions. Optical Character Recognition (OCR) is also 
employed. The main idea proposed in [30] is that U-net produces satisfactory model results with 
shorter training time and precise output. The colour space is changed to the HSV model for 
pre-processing [196-199]. The paper also proposes transfer learning and adhesive cell detection 
for higher model accuracy. 


3. System Architecture 


[Flowchart: Start; Taking input images (blood smear image) as ...; Building a ...; Converting input images ...]

Figure 1. Input and Colour space module 


The input to the proposed system is a data set of blood smear images. The necessary modules are then 
imported for building a model (figure 1). The colour transformation is carried out with the HSI colour 
model. The image is segmented, and the boundaries of the WBCs are marked 
with the help of the U-net architecture. The model is then fine-tuned with transfer learning to increase 
efficiency. Finally, the data set is passed through the model, and the result is an image in which WBCs 
are distinguished from RBCs and platelets (figures 2 and 3). 

Figure 2. Flowchart for image segmentation. 

Figure 3. Working of the WBC segmentation model 


4. Methodology 


The system produces three outputs, one at the end of each module. The first is the colour 
conversion from RGB to the HSI colour space. The second is the segmented cell image from the blood 
smear. The last is the final output, in which two models count the cells, one with OpenCV and one with 
transfer learning. 


4.1 HSI colour model 


The HSI colour model is better suited to our proposed system than other models, as it 
describes colours the same way humans see and describe them. Humans describe colours by their hue, 
saturation and brightness and not by the percentages of different primary colours present in a particular 
object. Hue is the colour attribute that describes a pure colour, such as blue or pink. 
Saturation is the purity of the colour, as in deep blue versus pale blue. Brightness is the colour 
intensity, whether of a self-illuminating or a reflecting object. To enhance the image quality, only the 
intensity component has to be changed, whereas in other models all the components have to be changed to 
get the desired outcome (figure 4). 


Figure 4: HSI Color Space Representation [32] 


The saturation component gives the degree to which the pure colour is diluted with white. Hue is 
expressed as an angle: taking 2pi as 360 degrees, 0 is red, pi/3 is yellow, 2pi/3 is green, 4pi/3 is blue, 
and 5pi/3 is magenta. Intensity ranges from 0 to 1, corresponding to black and white, respectively. 

Figure 5. HSI Model 


In figure 5 [31], the angle from the red axis, marked H, gives the hue. 
Saturation, marked S in the figure, is the length of the vector. The intensity is given by the 
position of the plane along the vertical axis. 


4.2 U-net 


U-net is a Convolutional Neural Network (CNN) architecture employed for precise biomedical image 
segmentation. The basic idea is to follow a contracting network with successive convolution layers that 
upsample or interpolate the feature maps. An important feature of U-net is that it can localize, separate 
and accurately identify the borders of objects in the input. This is because the architecture classifies 
every pixel of the input; as a result, the input and output share the same size. The base architecture of 
the U-net is in the shape of a U, hence the name. The two essential parts of this symmetrical U-shaped 
architecture are the left part, the contracting path, and the right part, the expansive path. The 
contracting path contains the usual convolution process, and the expansive path consists of transposed 
2D convolution layers. The U-net architecture learns segmentation end to end. Semantic segmentation 
assigns each pixel to a particular class. The architecture is analogous to an encoder-decoder network 
(figure 6). 

Figure 6: Base U-net architecture [12] 


In classification, only the final output of the deep network matters. Semantic segmentation, on the 
other hand, requires discrimination at the pixel level and a technique to project the 
discriminative features learnt at various levels of the encoder into pixel space. In the diagram, the 
encoder forms the first half of the architecture. Generally, it is a pre-trained 
classification network such as ResNet or VGG, in which convolution blocks are applied, each followed 
by max-pool downsampling, to encode the input image into feature representations at 
different levels. The second half of the architecture diagram is the decoder. The decoder aims to 
semantically project the lower-resolution discriminative features learnt by the encoder onto the 
higher-resolution pixel space, so that a dense classification can be made. The decoder 
consists of upsampling and concatenation followed by regular convolution operations. 


4.3 ResNet 


ResNet is short for residual neural network. ResNet is an artificial neural network (ANN) that builds 
on constructs found in the pyramidal cells of the cerebral cortex. It does so by 
using skip connections, or shortcuts, that jump over some layers. 
Typical ResNet models are implemented with double-layer or triple-layer skips 
that contain nonlinearities (ReLU) and batch normalization in between. In addition, a weight matrix 
can be used to learn the skip weights; such models are known as HighwayNets. Models with several 
parallel skips are referred to as DenseNets. In the context of residual neural networks, a non-residual 
network is described as a plain network (figure 7). 


[Figure 7 diagram: Conv1, Layer1, Layer2, Layer3, Layer4 (64, 128, 256, 512), Flatten, Dense, Softmax]


Figure 7: ResNet 34 parameter layers 


Many researchers have argued that convolutional neural networks perform better as they get deeper. 
The usual explanation is that the capacity of the model increases: 
a deeper model has a larger parameter space to explore, which makes it more flexible. 
However, beyond a certain depth, the performance starts to degrade. 
Another problem that ResNets address is the well-known vanishing gradient problem. When the network 
is very deep, repeated application of the chain rule drives the gradients of the loss function 
with respect to the early layers towards zero. As a result, the weights are not updated, and 
learning does not take place. With ResNets, the gradients can flow directly backwards through the skip 
connections from the final layers to the initial kernels. 
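
The effect of skip connections on the gradient can be illustrated with a toy scalar calculation. The sketch below is not taken from the paper: it assumes every layer has the same small local derivative (0.1, a hypothetical value) and compares the chain-rule product for a plain stack with that of a stack of residual blocks, where each block computes y = x + f(x) and therefore has derivative 1 + f'(x).

```python
# Toy illustration: gradient magnitude through a stack of layers,
# with and without identity skip connections (scalar case only).

def plain_gradient(local_grads):
    """Chain rule for a plain stack: product of the local derivatives."""
    grad = 1.0
    for d in local_grads:
        grad *= d
    return grad


def residual_gradient(local_grads):
    """Each residual block is y = x + f(x), so its derivative is 1 + f'(x)."""
    grad = 1.0
    for d in local_grads:
        grad *= 1.0 + d
    return grad


depth = 34            # hypothetical depth, chosen to echo ResNet-34
local = [0.1] * depth  # assumed small local derivative per layer

print(plain_gradient(local))     # shrinks towards zero
print(residual_gradient(local))  # stays well away from zero
```

In the plain stack the product collapses towards zero, which is the vanishing gradient described above; the identity term contributed by each skip connection keeps the product from collapsing.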


4.4 VGG 


VGG is a pre-trained Convolutional Neural Network and a widely used method 
for visual image classification. It has been applied in the biomedical field and elsewhere, for example 
to heart rate calculation from movement and to pavement distress detection. VGG helps identify and 
classify objects not previously seen by the network. VGG takes RGB colour images as input and sends them 
through convolution layers with a fixed filter size. Pooling layers then downsample the input image 
(figure 8). 

Figure 8. VGG Model [33] 

Specifically, a VGG network takes a 224x224 RGB image and sends it through a stack of 
convolutional layers with a fixed filter size of 3x3 and a stride of 1. VGG also deploys 1x1 
convolutional layers, which add nonlinearity without changing the receptive fields. The main 
disadvantages of VGG are memory consumption, the number of parameters and computation time 
(figure 9). 


[Figure 9 diagram: layer volumes from 224x224x64 down to 7x7x512, followed by 1x1x4096 and 1x1x1000 layers]


Figure 9. VGG Layered Architecture [34] 
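
The spatial sizes in the VGG description can be checked with the standard output-size formula for convolution and pooling, out = (in + 2p - k) / s + 1. The sketch below is only illustrative: the padding of 1 and the per-block convolution counts (2, 2, 3, 3, 3, as in VGG-16) are assumptions not stated in this paper, and the function names are hypothetical.

```python
def conv_out(size, kernel=3, stride=1, padding=1):
    """Spatial output size of a convolution (padding of 1 is assumed)."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, window=2, stride=2):
    """Spatial output size of a max-pooling layer."""
    return (size - window) // stride + 1


size = 224
sizes = [size]
# Assumed VGG-16 layout: five blocks of 3x3 convolutions, each block
# followed by one 2x2 max pool with stride 2.
for convs_in_block in (2, 2, 3, 3, 3):
    for _ in range(convs_in_block):
        size = conv_out(size)  # 3x3, stride 1, padding 1 keeps the size
    size = pool_out(size)      # each pool halves the size
    sizes.append(size)

print(sizes)  # [224, 112, 56, 28, 14, 7]
```

Under these assumptions the convolutions preserve the spatial size and only the five pooling layers reduce it, which takes the 224x224 input down to the 7x7 map that precedes the fully connected layers.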


5. Implementation 


The input of the system is a blood smear image. These images are stored collectively as a dataset. 
In practice, publicly available images can be fed into this dataset. The dataset is 
downloaded as a local file on the device. The model is written in Python and runs in a local Python 
IDE. In the first step, the necessary modules are imported, and the dataset is linked to the file. Then 
the following steps are executed to obtain the desired output. The input RGB image is converted into 
the HSI colour space to enhance the image [35], since in HSI the luminous intensity is separated from 
the colour information and can be ignored. After this step, the nucleus and the cytoplasm differ in 
colour. After the transformation, the U-net convolutional architecture is applied. This architecture 
segments the target in the image during downsampling, while upsampling marks the boundary of the 
target, which is useful for biomedical images. The model is then trained on the training data. The count 
is then obtained with OpenCV in Python. The system then fine-tunes the model with VGG and ResNet to 
count cells, and it can be extended to diagnose or classify WBC. To analyze the cell count further, VGG 
can be used, as it compares different parameters and indicates the leukemia condition. ResNet in 
transfer learning can classify WBCs into their types. The ResNet architecture treats each layer as a 
learning module, so it references the method instead of the image when classifying unknown data [27]. 
The model's usage varies with the lab's needs. If a lab wishes to detect only leukemia, then the WBC 
count is enough, so OpenCV can be used for optimization, and test results can be obtained quickly. When 
the type of WBC is needed, another module, transfer learning, extends the system further as its swarm 
intelligence. 
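
The paper does not list the OpenCV calls used for counting. As a rough sketch of what counting on a segmented mask involves, the pure-Python function below counts 8-connected foreground regions in a binary mask; OpenCV offers a comparable operation through `cv2.connectedComponents`. The function name `count_cells`, the `min_area` parameter and the sample mask are hypothetical.

```python
from collections import deque


def count_cells(mask, min_area=1):
    """Count 8-connected foreground regions in a binary mask.

    mask is a list of equal-length rows containing 0 (background) or
    1 (cell). Regions smaller than min_area pixels are ignored as noise.
    """
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill over one connected region.
            area = 0
            queue = deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                area += 1
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            if area >= min_area:
                count += 1
    return count


# Hypothetical segmented mask: one 2x2 cell, one 2-pixel cell, one speck.
mask = [
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 1, 0, 0, 0],
]
print(count_cells(mask))              # 3
print(count_cells(mask, min_area=2))  # 2
```

A minimum-area filter of this kind is one simple way to keep isolated noise pixels from being counted as cells; touching or overlapping cells would still be merged into a single region and need separate handling.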


5.1 Conversion of RGB to HSI 


An image is selected from the dataset, and the HSI conversion is applied as follows. 
The image's RGB values are normalized to the [0,1] range. 
Then all three components, intensity, saturation and hue, are computed. The intensity is the average 
of the R, G and B colour components. Saturation is based on the minimum of these colour values. Hue is 
obtained from an angle theta, computed with the inverse cosine function given by Rafael Gonzalez and 
Richard Woods [36] (figure 10). 


S = 1 - (3/N) * min(R, G, B)        (1) 
N = R + G + B                       (2) 

For hue, 
H = theta,        where G >= B      (3) 
H = 360 - theta,  where G < B       (4) 


602 | Published by “ CENTRAL ASIAN STUDIES" http://www.centralasianstudies.org 


Copyright (c) 2022 Author (s). This is an open-access article distributed under the terms of Creative Commons 
Attribution License (CC BY).To view a copy of this license, visit https://creativecommons.org/licenses/by/4.0/ 


CAJMNS Volume: 03 Issue: 03 | May- Jun 2022 


Figure 10. Image in HSI 
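
The conversion can be sketched in a few lines of Python. The paper does not spell out the expression for theta; the version below assumes the standard form from Gonzalez and Woods, theta = arccos( 0.5 * ((R - G) + (R - B)) / sqrt((R - G)^2 + (R - B)(G - B)) ), and the function name `rgb_to_hsi` is hypothetical. Setting hue to 0 for black and grey pixels, where it is undefined, is also an assumption.

```python
import math


def rgb_to_hsi(red, green, blue):
    """Convert 8-bit RGB to (H in degrees, S in [0, 1], I in [0, 1])."""
    r, g, b = red / 255.0, green / 255.0, blue / 255.0
    n = r + g + b
    intensity = n / 3.0
    if n == 0:
        # Black pixel: hue and saturation are undefined, reported as 0.
        return 0.0, 0.0, 0.0
    saturation = 1.0 - (3.0 / n) * min(r, g, b)
    numerator = 0.5 * ((r - g) + (r - b))
    denominator = math.sqrt((r - g) ** 2 + (r - b) * (g - b))
    if denominator == 0:
        # Grey pixel (R = G = B): hue is undefined, reported as 0.
        return 0.0, saturation, intensity
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, numerator / denominator))
    theta = math.degrees(math.acos(cosine))
    hue = theta if g >= b else 360.0 - theta
    return hue, saturation, intensity


for rgb in [(12, 15, 24), (70, 39, 55), (72, 150, 60)]:
    h, s, i = rgb_to_hsi(*rgb)
    print(rgb, round(h), round(s, 3), round(i, 3))
```

With this form of theta, the sample inputs reproduce the corresponding rows of table 1 in the results section to the precision shown there.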


5.2 Image segmentation with U-net 


U-net is a fully convolutional network. When an image is presented, it is progressively reduced to 
smaller sizes so that objects can be localized. Objects are first detected, and each pixel is then 
classified into one of two classes: target cell or environment. 


Step 1 - Taking a full-size input image. 

Step 2 - Applying convolutional layers. 

Each convolutional layer performs the following transformations. 

Step 3 - Applying k filters to get a feature map of size n x n x k. 
Step 4 - Extracting the convolved matrix, called a feature map. 

Step 5 - Max pooling with a 2x2 window to obtain a more compact feature map. 


The above steps are called downsampling. At the bottleneck, the number of feature channels in the final 
feature map is doubled. For upsampling, the feature maps are traced back while locating the object's 
position. In the process, the image is restored to its original size. 


Step 6 - Creating a convolution matrix of a size related to the map. 
Step 7 - Applying a transposed convolution to increase the size of the latest feature map. 


Step 8 - Repeating the above steps until the original image's size is reached. 
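
The change in size during the steps above can be shown on a small feature map. This sketch is a simplification, not the paper's implementation: it uses a single channel, and it substitutes nearest-neighbour upsampling for the learned transposed convolution of Step 7, since only the size change is being illustrated. The function names are hypothetical.

```python
def max_pool_2x2(fmap):
    """2x2 max pooling with stride 2 on a 2D list of even height and width."""
    return [
        [max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
         for c in range(0, len(fmap[0]), 2)]
        for r in range(0, len(fmap), 2)
    ]


def upsample_2x(fmap):
    """Nearest-neighbour 2x upsampling (stand-in for a transposed convolution)."""
    out = []
    for row in fmap:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


feature_map = [
    [1, 2, 0, 0],
    [3, 4, 0, 1],
    [0, 0, 5, 6],
    [0, 2, 7, 8],
]
pooled = max_pool_2x2(feature_map)  # 4x4 -> 2x2 (downsampling, Step 5)
restored = upsample_2x(pooled)      # 2x2 -> 4x4 (upsampling, Steps 6-8)

print(pooled)    # [[4, 1], [2, 8]]
print(restored)  # [[4, 4, 1, 1], [4, 4, 1, 1], [2, 2, 8, 8], [2, 2, 8, 8]]
```

The restored map has the original size but has lost fine detail, which is why U-net concatenates the matching feature maps from the contracting path during upsampling.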


Figure 11. Convolution layer in downsampling 


Hence, in this phase, when a blood image is presented, the network searches for the indicated cell, 
as the WBC nucleus has a distinct violet colour that is visible after the transformation. The nucleus is 
marked and separated from the background in downsampling, and the cell outline is marked in 
upsampling, as shown in figure 11. 


5.3 Transfer learning 


In this module, the segmented output image is passed as input to another model for further prediction. 
A feature vector, i.e. a set of listed conditions or given parameters, is measured. When an external 
image arrives from a source outside the given data, ResNet checks for the features required by the 
model rather than comparing images pixel by pixel. VGG adds further layers for selecting refined 
features and can be extended up to 16-19 layers. It also uses 1x1 convolution filters, 
which apply a linear transformation to the input channels. ResNet uses residual modules that allow 
skipping a few convolution layers when a layer is found to degrade the image representation (figure 12). 


Figure 12. WBC segmented from the background 


In this application, the transfer learning module is carried out by one of the models, ResNet or VGG; 
either of them looks for important features such as the actual boundary of clustered cells and the 
cytoplasm. The additional trained layers can find significant differences from normal blood cell images. 
ResNet can skip layers that distort the output. These may be layers where the model still looks for 
non-valued edges and applies the training recurrently on the executed and final output. 


6. Results and Discussions 


The model is built with Python libraries such as PyTorch and NumPy. To fully utilize the benefit of the 
design, the system should have a GPU. The model can be run in a Python IDE. In the first step of the 
segmentation, the image is enhanced by the colour transformation. This conversion 
from RGB to HSI improves the visibility of the cells (figures 13 to 15). 

Figure 15. Intensity values for RGB 

The RGB values considered are listed in table 1. 
Table 1. Variations of hue, saturation and intensity with RGB 
Red | Green | Blue | H   | S     | I 
0   | 0     | 0    | 0   | 0     | 0 
12  | 15    | 24   | 226 | 0.294 | 0.067 
70  | 39    | 55   | 329 | 0.287 | 0.214 
72  | 150   | 60   | 113 | 0.362 | 0.369 
92  | 73    | 231  | 246 | 0.447 | 0.518 
187 | 43    | 78   | 347 | 0.581 | 0.403 


Biomedical images are not readily available, so the available data must be used optimally. Unlike other 
architectures, U-net passes learnable parameters on to the next layers to retain precise information 
about the image. It leaves out edges that belong to the environment. It is therefore well suited to 
smaller datasets. With transfer learning, ResNet replaces the output layer with a feature vector of 
so-called functions. This works well for normal cells and linearly arranged cells. For normal clusters, 
it identifies cells that are clustered but not overlapping. For abnormal cell images, the model can 
detect linearly arranged cells, but the adhesive cells caused by leukemia may mislead it and reduce 
accuracy. With abnormal clustered cells, the model reaches an accuracy of up to 87% on the given data. 
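
The paper does not state how the 87% figure is computed. As a sketch under that caveat, the functions below show two common ways of scoring a predicted segmentation mask against a ground-truth mask: pixel accuracy, and the intersection over union (IoU) score mentioned in the related works. The function names and the sample masks are hypothetical.

```python
def pixel_accuracy(pred, truth):
    """Fraction of pixels whose predicted label matches the ground truth."""
    total = correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


def iou(pred, truth):
    """Intersection over union of the foreground (label 1) pixels."""
    intersection = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            intersection += int(p == 1 and t == 1)
            union += int(p == 1 or t == 1)
    # Two empty masks agree perfectly, so report 1.0 in that case.
    return intersection / union if union else 1.0


truth = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
print(pixel_accuracy(pred, truth))  # 10 of 12 pixels agree
print(iou(pred, truth))             # 3 shared pixels out of 5 in the union
```

Pixel accuracy is dominated by the background when cells occupy a small part of the image, so IoU is often reported alongside it for segmentation tasks.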


7. Conclusion and Future Works 


The paper segments WBCs from blood smear images. The colour space is first changed to HSI, followed by 
image segmentation with the U-net CNN and transfer learning with ResNet. It concludes that 
WBCs can be identified by automatic image segmentation. These cells can be counted, which 
helps in the detection of leukemia. Finally, using two methods to count the cells, where one 
compensates for the other under swarm intelligence, optimizes the model and increases the system's 
modelling capability. When transfer learning is used, the time complexity of the model is reduced. In the 
future, the model's efficiency can be increased with other transfer learning models, such as the 
Inception series, and with larger training data. 